An Algorithm-by-Blocks for SuperMatrix Band Cholesky Factorization
Authors
Abstract
We pursue the scalable parallel implementation of the factorization of band matrices with medium to large bandwidth targeting SMP and multi-core architectures. Our approach decomposes the computation into a large number of fine-grained operations exposing a higher degree of parallelism. The SuperMatrix run-time system allows an out-of-order scheduling of operations that is transparent to the programmer. Experimental results for the Cholesky factorization of band matrices on two parallel platforms with sixteen processors demonstrate the scalability of the solution.
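The decomposition into fine-grained block operations can be illustrated with a minimal sketch. It assumes a right-looking Cholesky on an `nb` x `nb` grid of blocks in which block `(i, j)` of the lower triangle is zero whenever `i - j > d`. The function names, the task tuples, and the wave-based grouping are hypothetical illustrations, not the SuperMatrix API or the paper's scheduler; an actual run-time system would dispatch each task as soon as its operands are ready rather than in synchronized waves.

```python
from collections import defaultdict


def band_cholesky_tasks(nb, d):
    """Enumerate block tasks of a right-looking band Cholesky.

    nb : number of block rows/columns (hypothetical parameter)
    d  : block half bandwidth; block (i, j) is zero when i - j > d
    Each task is (name, output_block, input_blocks).
    """
    tasks = []
    for k in range(nb):
        last = min(nb, k + d + 1)  # rows below k + d lie outside the band
        tasks.append(("CHOL", (k, k), []))
        for i in range(k + 1, last):
            # Triangular solve of a panel block against the diagonal block.
            tasks.append(("TRSM", (i, k), [(k, k)]))
        for i in range(k + 1, last):
            for j in range(k + 1, i + 1):
                # Trailing update: symmetric on the diagonal, general below it.
                name = "SYRK" if i == j else "GEMM"
                tasks.append((name, (i, j), [(i, k), (j, k)]))
    return tasks


def schedule_in_waves(tasks):
    """Group tasks into waves of mutually independent tasks.

    Dependences are inferred from block reads and writes in program
    order; a task is placed one wave after the latest task it must wait for.
    """
    last_writer = {}  # block -> wave of the last task that wrote it
    last_access = {}  # block -> latest wave of any task that touched it
    waves = defaultdict(list)
    for name, out, ins in tasks:
        level = last_access.get(out, -1) + 1  # wait for earlier users of out
        for blk in ins:
            level = max(level, last_writer.get(blk, -1) + 1)  # read after write
        waves[level].append((name, out))
        last_writer[out] = level
        last_access[out] = level
        for blk in ins:
            last_access[blk] = max(last_access.get(blk, -1), level)
    return [waves[w] for w in sorted(waves)]


if __name__ == "__main__":
    for wave_no, wave in enumerate(schedule_in_waves(band_cholesky_tasks(4, 2))):
        print(wave_no, wave)
```

In this sketch a wider band (larger `d`) yields more tasks per wave, which is one way to see why medium to large bandwidths offer parallelism that a narrow band does not.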
Related resources
A Scalable Parallel Block Algorithm for Band Cholesky Factorization
In this paper, we present an algorithm for computing the Cholesky factorization of large banded matrices on the IBM distributed memory parallel machines. The algorithm aims at optimizing the single node performance and minimizing the communication overheads. An important result of our paper is that the proposed algorithm is strongly scalable. As the bandwidth of the matrix increases, the number...
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout
We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-by-blocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in...
Robust Approximate Cholesky Factorization of Rank-Structured Symmetric Positive Definite Matrices
Given a symmetric positive definite matrix A, we compute a structured approximate Cholesky factorization A ≈ R^T R up to any desired accuracy, where R is an upper triangular hierarchically semiseparable (HSS) matrix. The factorization is stable, robust, and efficient. The method compresses off-diagonal blocks with rank-revealing orthogonal decompositions. In the meantime, positive semidefinite te...
DDtBe for Band Symmetric Positive Definite Matrices
We present a new parallel factorization for band symmetric positive definite (s.p.d) matrices and show some of its applications. Let A be a band s.p.d matrix of order n and half bandwidth m. We show how to factor A as A =DDt Be using approximately 4nm2 jp parallel operations where p =21: is the number of processors. Having this factorization, we improve the time to solve Ax = b by a factor of m...
Task Scheduling using Block Dependency DAG of Block-Oriented Sparse Cholesky Factorization
The block-oriented sparse Cholesky factorization decomposes a sparse matrix into rectangular sub-blocks, and handles each block as a computational unit in order to increase data reuse in a hierarchical memory system. As well, the factorization method increases the degree of concurrency with the reduction of communication volumes so that it performs more efficiently on a distributed-memory multipr...
Publication date: 2008